Dynamic data for knowledge base?
# support
o
I'm trying to set up a knowledge base for a voice agent based on an XML feed that I have (it updates in real time). Currently, I'm grabbing the XML, converting it into plain text, and sending a POST to VAPI's API, but it can't seem to accurately organize the data based on the XML elements once it's in plain text. Theoretically, I could programmatically convert the file to any format, but the accepted formats don't seem well suited to organized data. Does anyone have any experience/thoughts on how to reference a dynamic data set when building a voice agent? Or on the best way to structure organized data to serve as a knowledge base?
v
@Okapi It sounds like you're facing a challenge with integrating dynamic XML data into your voice agent's knowledge base. To handle dynamic data sets effectively, you might consider the following approaches:
1. **XML Parsing**: Instead of converting the XML to plain text, you should parse the XML to extract the relevant information. This way, you can maintain the structure and hierarchy of the data, which is crucial for the voice agent to understand and organize the information.
2. **Data Transformation**: After parsing the XML, transform the data into a format that is compatible with VAPI. JSON is often a preferred format for APIs because it's lightweight and easy for machines to parse and generate.
3. **API Endpoint**: Instead of sending the data directly to VAPI, you could create an API endpoint that VAPI can query. This endpoint would handle the XML parsing and transformation, and then serve the structured data to the voice agent in real-time.
4. **Webhooks**: If VAPI supports webhooks, you could set up a webhook to listen for updates from your XML feed. Whenever there's a new update, the webhook can trigger a process to parse and send the relevant data to the voice agent.
5. **Database Integration**: Another approach could be to parse the XML feed and store the structured data in a database. The voice agent could then query the database to retrieve the latest information when needed.
6. **Custom Adapter**: Develop a custom adapter that sits between your XML feed and VAPI. This adapter would be responsible for real-time parsing of the XML feed and formatting the data into a structure that VAPI can understand and use effectively.

If you need assistance with parsing XML or setting up any of these solutions, please let me know, and I can provide more detailed guidance or direct you to the appropriate support resources.
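As a rough illustration of approaches 1 and 2 (parse the XML, then transform it to JSON), here is a minimal sketch using only Python's standard library. The feed contents and element names (`catalog`, `item`, `price`) are made up for the example, since the real feed is not shown in this thread, and the mapping of attributes and text into dict keys is just one possible convention.

```python
import json
import xml.etree.ElementTree as ET

# Hypothetical sample feed; the real element names are not shown in the thread.
SAMPLE_XML = """
<catalog>
  <item id="1">
    <name>Widget</name>
    <price currency="USD">9.99</price>
  </item>
  <item id="2">
    <name>Gadget</name>
    <price currency="USD">19.50</price>
  </item>
</catalog>
"""

def element_to_dict(elem):
    """Recursively convert an Element into plain dicts, lists, and strings.

    Attributes become keys, repeated child tags become lists, and a leaf
    with no attributes collapses to its text. An attribute that shares a
    name with a child tag is not handled specially in this sketch.
    """
    node = dict(elem.attrib)
    children = list(elem)
    if not children:
        text = (elem.text or "").strip()
        if not node:
            return text
        if text:
            node["text"] = text
        return node
    for child in children:
        value = element_to_dict(child)
        if child.tag in node:
            # Second or later occurrence of this tag: collect values in a list.
            if not isinstance(node[child.tag], list):
                node[child.tag] = [node[child.tag]]
            node[child.tag].append(value)
        else:
            node[child.tag] = value
    return node

feed = element_to_dict(ET.fromstring(SAMPLE_XML.strip()))
print(json.dumps(feed, indent=2))
```

Each nested element stays attached to its parent in the resulting structure, which is the property that gets lost when the XML is dumped to plain text.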
o
@Vapi what is the best format to use for structured data as a knowledge base?
v
@Okapi The best format for structured data as a knowledge base often depends on the system you're using, but generally, JSON (JavaScript Object Notation) is a widely accepted format for several reasons:
1. **Human-readable**: JSON is easy to read and write for humans, which simplifies the process of creating and maintaining a knowledge base.
2. **Machine-readable**: JSON is also very easy for machines to parse and generate, making it efficient for APIs and programming.
3. **Flexibility**: JSON can represent complex data structures, including nested arrays and objects, which is ideal for representing hierarchical data in a knowledge base.
4. **Compatibility**: JSON is supported by most programming languages and platforms, ensuring that your knowledge base can be easily integrated with various systems and technologies.
5. **Lightweight**: JSON is text-based and typically less verbose than other formats like XML, which can lead to faster processing and lower bandwidth usage.
6. **Standardization**: JSON is a standardized format, which means it follows a set of rules that are universally understood and accepted.
7. **API-friendly**: Many web services and APIs use JSON as their communication format, making it a natural choice for knowledge bases that need to be accessed programmatically.

While JSON is a great choice, it's important to consider the specific requirements of your voice agent platform (VAPI in this case) and ensure that the format you choose is supported and can be efficiently processed by the system. If VAPI has specific guidelines or preferred formats for knowledge bases, it would be best to adhere to those standards.
Got it. Tagging @User for additional help.
o
@User from what I can see, knowledge bases don't support JSON, so I'm trying to figure out how to store structured data as a knowledge base
c
I am not sure about just uploading the file, but the KB can use JSON. What I think might work in your case is that it just gets updated.
o
how do I use a JSON file for the KB without uploading it?
c
Look at the KB provider.
a
No, but name the file .txt and the KB will still be able to read it and interpret it; it just won't be able to parse the JSON. If you really REALLY want JSON, then you have to create your own backend KB system or conversational paths, but I wouldn't recommend that, especially with everything moving so fast. Is your concern lack of reliability in answers?
@Okapi
o
thank you, Alejo, that was a very helpful response. Yes, that's exactly right... I have a large data set that changes daily, and I'd like the voice agent to be able to reference its details so that it can answer questions about it accurately when/if asked
I've tried using XML data as a .txt file and it was able to reference it, but it wasn't able to accurately match nested elements to the parent element
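One way to work around the nesting problem in a plain-text file is to repeat the parent context on every line, so that no value depends on its position in the file. The sketch below is only an illustration of that idea: the sample elements and the use of an `id` attribute to tell records apart are assumptions, not details from the actual feed.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample; the real feed's element names are not shown in the thread.
SAMPLE = (
    '<store><product id="7"><name>Lamp</name>'
    "<stock><warehouse>East</warehouse><count>12</count></stock>"
    "</product></store>"
)

def flatten(elem, path=""):
    """Yield 'parent > child: value' lines so every leaf carries its ancestry."""
    label = elem.tag
    ident = elem.get("id")  # assumption: records are identified by an 'id' attribute
    if ident is not None:
        label += f"[id={ident}]"
    here = f"{path} > {label}" if path else label
    text = (elem.text or "").strip()
    if text:
        yield f"{here}: {text}"
    for child in elem:
        yield from flatten(child, here)

for line in flatten(ET.fromstring(SAMPLE)):
    print(line)
```

The output is more verbose than the original XML, but a retrieval system that splits the file into chunks would still see which product each nested value belongs to.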
a
One option is an external database; another is setting splits_per_chunk between 3 and 5. You would have to test it
@Chowderr thoughts?
o
I was using Vapi's KB... not sure I understand what you're suggesting
ahhh using an external db is also an option for me, I didn't realize I could connect vapi to a db
a
You would need Make or a server
o
I could do both- is there any existing documentation around connecting vapi to an external db?
a
I don’t know actually, @Chowderr probably knows
o
specifically around using that database as a KB?
ok thanks. Will wait to hear back from @Chowderr
a
If I were a betting man, I would say yes
o
that would be amazing and exactly what I need
I'll keep looking around to see what I can find in the meantime... thanks for getting me pointed in the right direction!
c
I have been lost on the custom KB. I had some ideas but have not tested it.
VAPI's KnowledgeBase provider is Trieve (which is separate from a file). How big is the file (as in numbers)?
o
the JSON file is around 500KB. Over 10k lines
@Chowderr is that what you were asking?
c
Yea. Can you paste an object from it?
o
yep one moment
c
this would technically be one record?
o
yep, technically
c
Give me a minute. Just a theory, but I'm sure it would work
Do you use Make, or?
o
ok np. not yet- I'm just getting started on make
up til now we've always just built microservices
c
But you have an account?
o
yep
c
Google Workspace account or no?
o
yep
are you thinking convert each object to a row in google sheets?
c
md or text file with proper chunking on each record (looks like by 50-60)
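A minimal sketch of what "one chunk per record" could look like when converting a JSON export to Markdown: each record becomes its own headed section, separated by a blank line. The field names and the `title_key` parameter are hypothetical, and how the knowledge base actually splits the file into chunks is not covered here.

```python
import json

def record_to_markdown(record, title_key="name"):
    """Render one record as a self-contained Markdown section."""
    title = record.get(title_key, "Record")
    lines = [f"## {title}"]
    for key, value in record.items():
        if key == title_key:
            continue
        if isinstance(value, (dict, list)):
            # Keep nested values on the same line as their key.
            value = json.dumps(value, ensure_ascii=False)
        lines.append(f"- {key}: {value}")
    return "\n".join(lines)

def records_to_markdown(records, title_key="name"):
    """Join all records, with a blank line between sections."""
    sections = (record_to_markdown(r, title_key) for r in records)
    return "\n\n".join(sections) + "\n"

# Hypothetical records standing in for objects from the JSON file.
records = [
    {"name": "Lamp", "price": 12, "tags": ["home"]},
    {"name": "Chair", "price": 40, "tags": ["home", "office"]},
]
print(records_to_markdown(records))
```

Because every section repeats its own heading, a chunk that contains a single record still says which record it describes.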
o
md?
c
I don't even know what it stands for
o
ohhhhh, a markdown file
got it
ok sorry to interrupt whatever you were thinking- would still love to know your thoughts on getting it all into a KB that VAPI could reference when you have a moment
c
not a chance you could share the file huh?
o
the json file?
I'd be happy to- there's nothing sensitive in it
c
You can DM it to me if you want
o
ok cool, just shot it over to you
s
@Okapi Sorry for the late response. For dynamic data, you need to use tool calls. For our knowledge base, we only accept plain text data, though you can add markdown formatting if needed. I'm not sure exactly where you're blocked - if you could share the current status of the issue, I'd be happy to help.
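For the tool-call route, the server side mostly comes down to a lookup over the latest copy of the data. The sketch below shows only that lookup, under stated assumptions: the record fields are invented, `find_records` is a hypothetical helper, and the request and response format Vapi uses for tool calls is deliberately not shown because it is not described in this thread.

```python
def find_records(records, query, limit=3):
    """Return up to `limit` records whose values contain the query text.

    Matching is a case-insensitive substring test over all field values,
    which is a simplification; a real service might query a database instead.
    """
    needle = query.strip().lower()
    if not needle:
        return []
    hits = []
    for record in records:
        haystack = " ".join(str(value) for value in record.values()).lower()
        if needle in haystack:
            hits.append(record)
            if len(hits) == limit:
                break
    return hits

# Hypothetical data standing in for the parsed feed.
records = [
    {"name": "Lamp", "room": "Office"},
    {"name": "Desk lamp", "room": "Study"},
    {"name": "Chair", "room": "Office"},
]
print(find_records(records, "lamp"))
```

A handler on your own server would call something like this with the caller's question and return the matching records, so the agent always answers from the current data rather than from an uploaded file.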
c
I will have this solved for him by tomorrow.
a
@Okapi Trieve should handle that just fine...
@Okapi hey man, this is probably your starting point for dynamic KB creation and updating. With the API you should be able to easily update an assistant with new KBs as often as you want.

https://www.youtube.com/watch?v=sQIKNEwGJoE
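If the knowledge base is refreshed through the API on a schedule, it can help to skip the upload when the feed has not changed. Here is a small sketch of that check; `upload` is a hypothetical callback standing in for whatever API call performs the update, which is not shown here.

```python
import hashlib

def content_fingerprint(text):
    """Return a stable SHA-256 fingerprint of the rendered knowledge base text."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()

def sync_if_changed(new_text, last_fingerprint, upload):
    """Call upload(new_text) only when the content differs from the last run.

    `upload` is a placeholder for the real update call. The returned
    fingerprint should be stored and passed back in on the next run.
    """
    fingerprint = content_fingerprint(new_text)
    if fingerprint != last_fingerprint:
        upload(new_text)
    return fingerprint

# Usage with a stand-in uploader that just records what it was given.
uploaded = []
state = sync_if_changed("feed v1", None, uploaded.append)
state = sync_if_changed("feed v1", state, uploaded.append)  # unchanged, skipped
state = sync_if_changed("feed v2", state, uploaded.append)
print(len(uploaded))
```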

o
thanks for the help, everyone!